ABSTRACT


Artificial neural networks provide a new approach to commodity forecasting that 
does not require algorithm or rule development. Neural networks have been deemed 
successful in applications involving optimization, classification, identification, pattern 
recognition and time series forecasting. With the advent of user friendly, commercially 
available software packages that work in a spreadsheet environment, such as NeuralWorks 
Predict by NeuralWare, more people can take advantage of the power of artificial neural 
networks. This thesis provides an introduction to neural networks, and reviews two recent 
studies of commodity price forecasting. This study also develops a neural network
model using NeuralWorks Predict that forecasts jet fuel prices for the Defense Fuel 
Supply Center (DFSC). In addition, the results developed are compared to the output of 
an econometric regression model, specifically, the Department of Energy’s Short-Term 
Integrated Forecasting System (STIFS) model. The Predict artificial neural network 
model produced more accurate results and reduced the contribution of outliers more
effectively than the STIFS model, thus producing a more robust model.
THESIS DISCLAIMER


The reader is cautioned that computer programs developed in this research may not 
have been exercised for all cases of interest. While every effort has been made, within 
the time available, to ensure that the programs are free of computational and logic errors, 
they cannot be considered validated. Any application of these programs without
additional verification is at the risk of the user.
EXECUTIVE SUMMARY


Neuralcomputing is one of the first alternatives to programmed computing. 
Programmed computing involves devising an algorithm and/or a set of rules for solving 
the problem and then correctly coding these decisions in software. However, 
programmed computing can only be applied in cases that can be described by a known 
procedure or set of rules. 

Neuralcomputing provides a new approach to information processing that does 
not require algorithm or rule development. This significantly reduces the quantity of 
software that must be developed and allows, for some types of problems, the 
development of information processing capabilities for which algorithms or rules are
not known or are too expensive, time consuming, or inconvenient to develop. The 
primary information processing structure in neuralcomputing is an artificial neural 
network. (Hecht-Nielsen, p. 2) 

Neural network research is one of the most active areas in the world of 
management science today. Neural networks have been deemed successful in 
applications involving optimization, classification, identification, pattern recognition and 
time series forecasting. 

This study examines the commodity of jet fuel and provides a background 
knowledge of the Defense Fuel Supply Center (DFSC) and the Defense Logistics 
Agency. Then the jet fuel equation of the Department of Energy’s Short-Term Integrated 
Forecasting System (STIFS) model is introduced. An introduction to neural networks is 
provided, and two recent studies of forecasting commodities prices are reviewed. This 
study also develops a neural network model that forecasts jet fuel prices for the DFSC 
using NeuralWorks Predict. In addition, the results developed are compared to the output
of an econometric regression model, specifically, the Department of Energy’s Short-Term 
Integrated Forecasting System model. 

The study addresses and answers three questions, namely: Can jet fuel prices be 
adequately predicted with a neural network model? Yes, it is possible to build a
statistically sound artificial neural network with a commercially available software
package such as NeuralWorks Predict and obtain more accurate results than with a
conventional modeling approach such as regression. The Predict artificial neural network 
model reduced the contribution of outliers more effectively than the STIFS regression 
model, thus producing a more robust model. 

Would an artificial neural network model provide better forecasting results than
more common approaches such as an econometric regression model, specifically the
Department of Energy’s Short-Term Integrated Forecasting System (STIFS) model? Yes,
the artificial neural network model provided convincing results, outperforming the STIFS
regression model in five out of six areas of measured effectiveness over a twelve-year
period using monthly data. The NeuralWorks Predict model yielded a better coefficient 
of determination, mean squared error, mean absolute percent error, mean absolute 
deviation and maximum absolute error. 

Would an artificial neural network model provide a useful planning and decision 
aid for the Defense Fuel Supply Center (DFSC)? Yes, with the advent of user friendly 
commercially available software packages such as NeuralWorks Predict, DFSC would 
benefit from the further investigation of artificial neural networks in forecasting noisy 
data sets such as fuel. By reducing the error of the forecasts, better budgetary decisions 
may be made. Today’s software applications are designed to work in commonly used
spreadsheet environments.


I. INTRODUCTION 


A. MOTIVATION 


Neuralcomputing is one of the first alternatives to programmed computing. 
Programmed computing involves devising an algorithm and/or a set of rules for solving 
the problem and then correctly coding these decisions in software. However, 
programmed computing can only be applied in cases that can be described by a known 
procedure or set of rules. 

Neuralcomputing provides a new approach to information processing that does 
not require algorithm or rule development. This significantly reduces the quantity of 
software that must be developed and allows, for some types of problems, the 
development of information processing capabilities for which algorithms or rules are
not known or are too expensive, time consuming, or inconvenient to develop. The 
primary information processing structure in neuralcomputing is an artificial neural 
network. (Hecht-Nielsen, p. 2) 

Neural network research is one of the most active areas in the world of 
management science today. Neural networks have been deemed successful in 
applications involving optimization, classification, identification, pattern recognition and
time series forecasting.


B. OBJECTIVE 


The questions this thesis explores and answers are: 


1. Primary Research Question 


Can jet fuel prices be adequately predicted with a neural network model? 


2. Subsidiary Research Questions 


Would an artificial neural network model provide better forecasting results than 
more common approaches such as an econometric regression model, specifically, the
Department of Energy’s Short-Term Integrated Forecasting System (STIFS) model?

Would an artificial neural network model provide a useful planning and decision
aid for the Defense Fuel Supply Center (DFSC)?


C. PREVIEW 


In order to adequately answer these questions, the researcher first examines the 
commodity of oil in Chapter II. Factors that affect jet fuel prices are then discussed. 
Background information is provided both on DFSC and the Defense Logistics Agency 
(DLA) pertaining to the magnitude of the jet fuel forecasting problem. DFSC’s annual 
requirements are listed and a review of the current contracting practices is presented.
Finally, DFSC’s current method of predicting jet fuel prices is described.

Chapter III presents the Department of Energy’s Short-Term Integrated
Forecasting System and isolates the single equation that predicts jet fuel prices. 

Chapter IV is intended as a primer for those unfamiliar with artificial neural 
networks. The structure of the most common architecture, the feedforward multilayer
perceptron network trained with the backpropagation algorithm, is outlined. Two
examples of recent research are also described, namely, Grudnitski and Osburn’s Gold Futures
Model and Homaee’s Defense National Stockpile Center model. The latter documents
current neural network research conducted by a DLA organization.

Chapter V presents this study’s neural network model that predicts jet fuel prices 
and the tool used to build it, namely, NeuralWare NeuralWorks Predict. Chapter VI details the
measures of effectiveness used in this study and provides a comparison of the STIFS
model with the Predict model. Finally, answers to the research questions, areas of further
research and recommendations are addressed in Chapter VII.











II. JET FUEL AND THE DEFENSE FUEL SUPPLY CENTER


This chapter discusses some of the oil industry issues that result in unique 
management concerns. It also provides an overview of DFSC’s current organizational
structure and management perspective.


A. THE COMMODITY OF OIL AND ITS UNIQUENESS 


1. The Uniqueness of Oil 


Oil is the only commodity that controls the industrialized world. The control of 
oil or access to it enables nations to accumulate wealth, to fuel their economies, to 
produce and sell goods and services, to build, to buy, to move, to acquire and 
manufacture weapons, and to win wars (Yergin, p.777). Another unique quality of oil is 
that crude oil itself is a commodity with very few direct uses. Virtually all crude oil is 
processed in a refinery to produce useful products like motor gasoline (Mogas), jet fuel, 
heating oil and industrial fuel oil. Today’s refinery is often a large, complex,
sophisticated, and expensive manufacturing facility. (Yergin, p. 788)


2. Crude Oil Defined and its Characteristics 


Crude oil is a mixture of hydrocarbons that exists as a liquid in natural 
underground reservoirs and is the raw material which is refined into gasoline, heating oil, 
jet fuel, propane, petrochemicals and other useful products (NYMEX, pp. 9-10). Jet fuel 
is a high-quality kerosene product used primarily as fuel for commercial turbojet and 
turboprop aircraft engines (NYMEX, p.18). Heating oil (or Number 2 fuel) is a light 
distillate oil used for home heating, in compression ignition engines and in light industrial
applications (NYMEX, p.16).


3. Jet Fuel and its Relationship to Heating Oil 


Heating oil and jet fuel have an interesting relationship. They are both
categorized as light distillates and are formed from heavy oils by a chemical process
called hydrocracking. Within a refinery, production of the distillates has a substitution
relationship. That is, as the production of one distillate is increased, the production of the 
other is decreased by the same amount. Heating oil has a seasonal demand. During cold 
weather and unexpected or unusual cold periods, the increase in demand can result in a 
higher usage rate than normal for heating oil consumption. Refineries that operate at 
maximum capacity cannot react to this additional demand because of the substitution
relationship in production. Consequently, the decreased production of jet fuel and the
increased production of heating oil may result in price increases for each product. Thus,
heating oil shortages due to severe weather can very well cripple jet fuel production.


4. What Else Can Influence Jet Fuel Prices? 


Many events that affect crude oil prices can also affect jet fuel prices. The “Iron 
Law” of energy and economic growth suggests that there is an “inevitably and
inescapably close relationship between economic growth rates and the growth rates for 
energy and oil use. For instance, if the economy grew at 3 or 4 percent a year, as was 
generally presumed, oil demand would also grow by 3 or 4 percent a year. Income was 
the main determinant of energy and oil consumption.” (Yergin, p.671) 

Prior to 1973, there was little need for oil price forecasting. Price changes
had been measured in cents, not dollars, and for many years prices were more or less flat.
The United States produced most of the oil needed for domestic consumption. However,
by 1973 the United States accounted for a smaller percentage of world oil production. United
States crude oil prices became more volatile because United States oil production
remained relatively constant while the oil demands of industrialized countries,
including the U.S., were increasing. Because of the United States’ increased dependence
on foreign oil, an inability to control its price resulted. The repercussions of changing, 
volatile crude oil prices were not only of interest to the energy industries, but also to 
consumers of the refined products used by airlines and other transportation providers. 
(Yergin, p. 671) 

Consequently, oil analysts generally suggest that relationships exist between jet
fuel prices and the following phenomena: past jet fuel prices, crude oil prices, heating oil
prices, gasoline prices, the current demand and supply of heating oil, political events,
weather, and natural disasters.


B. THE DEFENSE FUEL SUPPLY CENTER (DFSC) 


1. Primary Role of DFSC 


The Defense Logistics Agency (DLA) is tasked with supplying all fuel 
requirements for the Department of Defense. DLA oversees six Inventory Control Points 
(ICPs) that each specialize in different types of commodities. The Defense Fuel Supply 
Center (DFSC) is the ICP which purchases all of the fuels, oils, and lubricants for the 
Department of Defense and many other federal agencies. DFSC has a world-wide 
mission to buy, distribute, maintain, and account for all petroleum products in its 
inventory.

Petroleum products are unique within the Defense Logistics Agency (DLA) 
because petroleum products have high consumption rates and DLA has limited storage 
capacity for petroleum products. The annual amount of money spent on crude oil 
products makes fuel the most expensive support item procured by the Department of 
Defense. (Elkins, p.1) 


2. DFSC Annual Requirements 


DFSC is the largest single customer for petroleum products in the world. Its 
annual fuel bill varies between $4 and $10 billion depending on market conditions and 
DOD needs (Hart, p.8). Over 70% of DFSC’s purchases are for jet fuel. DFSC’s largest 
jet fuel customer is the Air Force with sales that constitute approximately 55% of total 
contract obligations. The Navy is the second largest jet fuel consumer with sales totaling 
about 20% of all obligations. Table 2-1 provides a breakdown of total barrels purchased 
for each class of petroleum products managed. (DFSC, 1991, 1992, 1993, 1994, p. 8)

AVGAS refers to all aviation gasolines besides kerosene jet fuel. Kerosene jet
fuels used by DFSC include JP-4, JP-5, and JP-6. Motor gas is used for motor vehicles.
Distillate is used as heating oil. Residuals are the remains after refining and have limited
uses.


BARRELS PURCHASED (IN MILLIONS) 


Dollars per Barrel | $33.14 | $28.78 | $19.32 | $21.42 | $20.61 | $25.72 | $32.53 | $26.77 | $27.13





TABLE 2-1. Total Barrels Purchased for each Class of Petroleum Products Managed. 
(DFSC, 1991, 1992, 1993, 1994, p.8). 


The Department of Defense downsizing and the decrease in the number of 
customers that DFSC must support mean that DFSC needs to improve its cost
predictions. DFSC is the only ICP that cannot accurately predict its budget requirements.
DFSC does not take physical possession of the quantity of fuel needed to meet long range
demands because the transportation and holding costs for large volumes do not make it
cost effective. On a dollar per barrel ($/barrel) basis, fuel is a volumetrically low priced
commodity. Massive distribution points needed to accommodate long range demands are
avoided because the transportation costs related to the subsequent distribution of the fuel
are so high. Since the demand for fuel far exceeds available storage capacity, neglecting
Prepositioned War Reserves (PWR), DFSC is held hostage to the volatile market.


3. How Does DFSC Purchase Fuel Now and How Do They Pay For It? 


DFSC purchases fuel by issuing contracts. Government fuel contracts are 
typically written for one- or two-year periods. These contracts are either based on firm
requirements or are left indefinite as to quantity with only minimum and maximum
quantities specified. Contracts usually result in equal monthly deliveries over the period
of the agreement. Contract base prices are established at the time of contract award. 
Bulk contracts are delivery orders for major refineries where DFSC buys directly from 
refiners, takes claim of the fuel at or near the refinery, and arranges for delivery of the 
fuel. Local contracts, which compose approximately thirty percent of all contracts, 
purchase fuel for military installations across the country on a delivery basis. Local 
contracts differ from bulk contracts in that the contracts are not negotiated, but are 
awarded to the lowest bidder. Most bulk contracts run for one year,
while smaller local fuel supply contracts run for two years.

Because petroleum prices can be extremely volatile, a price adjustment clause is a 
necessity. A price adjustment clause allows the government and its fuel suppliers to 
share the risk of market volatility. Bulk contracts are price adjusted monthly with price 
data found in the Petroleum Marketing Monthly, a monthly listing of all types of fuel 
products. Local contracts are adjusted weekly with data found in the Oil Price 
Information Service or The Lundberg Letter, which are similar weekly publications. Since
DOD purchases certain fuels for military use that have no civilian market counterpart,
such as JP-5, which is used for high performance combat aircraft, price indices
for the most similar alternative commercial product are used.


4. How Does DFSC Predict Fuel Prices Now? 


DFSC relies heavily on the Department of Energy (DOE) for price predictions. 
DOE has developed the Short Term Integrated Forecasting System (STIFS) model to 
simulate the United States economy with its fuel supply, demand and price structure. The 
STIFS model is the subject of Chapter III and will be discussed in detail later. The input 
data for the STIFS model comes from the Petroleum Marketing Monthly which contains 
the actual selling prices of commodities in specific regions. Additional data sources are
published by the Bureau of Labor Statistics.








III. THE DEPARTMENT OF ENERGY’S SHORT-TERM INTEGRATED
FORECASTING SYSTEM (STIFS) MODEL 


This chapter is based on material found in the Short-Term Integrated
Forecasting System (STIFS): 1993 Model Documentation Report (DOE) and verbal and
written correspondence with Department of Energy analysts Neil Gamson (Gamson) and
Michael Morris (Morris).


A. INTRODUCTION TO THE SHORT-TERM INTEGRATED FORECASTING
SYSTEM 


1. Model Overview 


The Energy Information Administration (EIA) of the U.S. Department of Energy
(DOE) developed the STIFS model to generate short-term (where short-term is defined as
up to and including 8 quarters) monthly forecasts of U.S. supplies, demands, imports, 
exports, stocks, and prices of eight major forms of energy. These products are motor 
gasoline, distillate fuel oil, residual fuel oil, jet fuel, liquefied petroleum gases, other 
petroleum products, natural gas, electricity and coal. Inputs to STIFS consist of historical 
data and forecasts that relate production, demand, imports, exports and stocks of both 
primary and end-use energy sources. Historical data comes primarily from the Integrated 
Modeling Data System (IMDS), an in-house EIA electronic database. IMDS data is 
extracted from data reported regularly in EIA publications such as the Petroleum 
Marketing Monthly. Thus the model runs on monthly data aggregated to the national or 
total industry level. 

With STIFS, the user can simulate a variety of energy-market conditions that 
affect the projections of energy supply, demand, and prices by altering certain 
assumptions. STIFS is generally used as a policy and management tool to simulate 
changes to energy tax policy, energy regulations or world oil prices. STIFS is the 
integrated system which develops supply and demand forecasts that are published
quarterly in the Short-Term Energy Outlook. DFSC reviews the Short-Term Energy
Outlook for insight when generating budgetary requirements.











2. General Modeling Approach and Basic Assumptions 


STIFS is a collection of single equations formulated to forecast short-run 
variations in key energy quantity and price concepts which are reported routinely by EIA. 
STIFS makes several assumptions about short-run energy demand fluctuations. First,
production is demand driven. Secondly, monthly energy demand may be modeled by 
linear regression. Energy demand is the demand for energy products resulting from the 
collective demand for energy services (such as heating, cooling, lighting, personal travel, 
etc.) or the demand for energy inputs by industry in manufacturing or other industrial or 
commercial activities. Thirdly, domestic energy sources are assumed to be utilized first, 
with foreign sources assumed to be the source of energy supply once domestic capacity 
limits are reached. Finally, imports are expected to be significantly more important once 


domestic capacity constraints (such as refinery capacity) are approached. 


3. Statistical and Data Overview 


The STIFS model consists of 305 equations, of which 93 are estimated. The 93 
estimated equations are linear regression equations that together form a system of 
interrelated equations. However, this study is only interested in the structural links to the 
jet fuel price equation. It should be noted that in estimation, STIFS generally handles the 
separate equations one at a time, often with varying periods of estimation for different
variables. Nevertheless, numerous simultaneities exist in the model, and the model’s
solution algorithm provides a dynamic simultaneous solution. The general method of
estimation is ordinary least squares.


B. MATHEMATICAL SPECIFICATIONS 


1. Jet Fuel Price Equation 


The price of jet fuel is estimated using the linear regression equation:

\[
P_{jet} = B_0 + R_1 P_{jet-1} + P_1 P_{oil} + W_1 I_{pp} + D_1 \frac{S_{jet-1}}{D_{jet}} + D_2 C_{dum} \tag{3-1}
\]

where $P_{jet}$ is the average retail price of kerosene jet fuel, $B_0$ is the constant regression
coefficient, $R_1$ is the regression coefficient of jet fuel, $P_{jet-1}$ is the average retail price of
kerosene jet fuel lagged one month, $P_1$ is the coefficient of crude oil, $P_{oil}$ is the price of
crude oil, $W_1$ is the coefficient of the wholesale price index, $I_{pp}$ is the wholesale price
index for non-energy products as a measure of inflation, $D_1$ is the coefficient for the ratio
of last month’s jet fuel supply divided by this month’s jet fuel demand, $S_{jet-1}$ is last
month’s jet fuel supply, which divided by $D_{jet}$, the current month’s demand, results in the
projected month’s usage, $D_2$ is the coefficient for the dummy variable, and $C_{dum}$ is a dummy
binary variable representing the period of December 1989 through January 1990, when
cold weather caused all petroleum product prices to surge.

Equation 3-1 calculates the price of jet fuel in a linear regression equation as a 
function of the previous month’s jet fuel price, the current price of crude oil, and the 
previous month’s supply of jet fuel divided by the current month’s demand of jet fuel.
The producer price index less food and energy is used as an economic indicator because
jet fuel usage declines in periods of economic distress.
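
The structure of such a single-equation regression can be illustrated with a short ordinary least squares fit. The sketch below is not the STIFS implementation and uses no EIA data: the series are synthetic, the function and variable names (`fit_jet_fuel_ols`, `p_crude`, `ppi`, `cold_dummy`) are hypothetical, and the "true" coefficients are arbitrary values chosen only so the example runs. It assumes NumPy is available.

```python
import numpy as np

def fit_jet_fuel_ols(p_jet, p_crude, ppi, supply, demand, cold_dummy):
    """Regress the jet fuel price on a constant, its own one-month lag, the
    crude oil price, a price index, last month's supply over this month's
    demand, and a cold-weather dummy, using ordinary least squares."""
    y = p_jet[1:]                              # price in month t
    X = np.column_stack([
        np.ones(len(y)),                       # constant term
        p_jet[:-1],                            # price lagged one month
        p_crude[1:],                           # crude oil price in month t
        ppi[1:],                               # price index in month t
        supply[:-1] / demand[1:],              # supply(t-1) / demand(t)
        cold_dummy[1:],                        # 1 during the cold-weather months
    ])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, X @ coef

# Synthetic monthly data (illustrative only, not real prices).
rng = np.random.default_rng(0)
n = 145
p_crude = 18.0 + rng.normal(0.0, 2.0, n)
ppi = 100.0 + 0.2 * np.arange(n)
supply = 40.0 + rng.normal(0.0, 1.0, n)
demand = 40.0 + rng.normal(0.0, 1.0, n)
cold_dummy = np.zeros(n)
cold_dummy[60:62] = 1.0

true = np.array([2.0, 0.6, 0.9, 0.01, -3.0, 5.0])  # arbitrary coefficients
p_jet = np.empty(n)
p_jet[0] = 55.0
for t in range(1, n):
    p_jet[t] = (true[0] + true[1] * p_jet[t - 1] + true[2] * p_crude[t]
                + true[3] * ppi[t] + true[4] * supply[t - 1] / demand[t]
                + true[5] * cold_dummy[t] + rng.normal(0.0, 0.1))

coef, fitted = fit_jet_fuel_ols(p_jet, p_crude, ppi, supply, demand, cold_dummy)
rmse = float(np.sqrt(np.mean((fitted - p_jet[1:]) ** 2)))
print("estimated coefficients:", np.round(coef, 3))
print("in-sample RMSE:", round(rmse, 4))
```

Because the first observation has no lagged price, the fit uses one fewer observation than the length of the series.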














IV. AN INTRODUCTION TO ARTIFICIAL NEURAL NETWORKS AND HOW 
THEY DIFFER FROM MORE TRADITIONAL METHODS 


The material contained in section B of this chapter was compiled from several 
sources. Specifically, Artificial Neural Systems (Simpson), Neural Networks: A Primer 
(Wiggins), Neurocomputing (Hecht-Nielsen), NeuralWorks Predict 1.0 User’s Manual 
(Predict) and Neural Networks: An Introduction (Muller & Reinhardt). The material 
contained in section C was compiled from “Forecasting S&P and Gold Futures Prices: An 
Application of Neural Networks” (Grudnitski). The material contained in section D was
compiled from “Applying Backpropagation and general regression neural networks to
forecast commodity prices for the Defense National Stockpile Center” (Homaee).


A. WHAT IS A NEURAL NETWORK? 


1. Why Neural Networks Are Being Rediscovered 


From the advent of the first useful electronic digital computer, all information
processing applications utilized programmed computing. A problem was defined, 
parameters and constraints were specified, an algorithm was determined, and then the 
known information was coded into software. However, this method only provided 
solutions to problems that could be fully described. These computers were unable to 
handle problems which were not enumerated. The logical basis of computers caused 
these computers to produce inaccurate solutions if the software was not essentially 
perfect. Neural networks provide a means for computers to handle unenumerated 
problems. 

Although the first formal models of neural networks were designed in the 1940’s, 
it was not until the 1980’s that a renewed interest was generated in neural networks. 
Three factors assisted this resurgence. First, the field gained credibility through research 
performed by physicists. These scientists injected more rigor into the field by 
approaching the subject from a more scientific and analytical stance. Secondly, new and 
more powerful network architectures such as multilayer perceptrons using
backpropagation algorithms were discovered or rediscovered. Finally and most
importantly, the availability of less expensive and more powerful computers allowed
widespread experimentation with neural network techniques.


2. Neural Network Structure 


A neural network is a parallel distributed information processing structure in the 
form of a directed graph. The nodes of the graph are commonly called processing 
elements. The arcs of the graph are called connections. An adjustable value called a 
weight is associated with each connected pair of processing elements. The weight, $w_{ij}$,
represents the strength of the connection. The processing elements are organized into 
layers with full or random connections between successive layers. Nodes in the input
layer receive input, and nodes in the output layer provide output. Nodes in the middle 
layers receive signals from the input nodes and pass signals to output nodes. The value 
entering a processing element is typically the sum of each incoming value multiplied by 
its respective connection weight. This is often referred to as internal activation or a 
summation function, and is expressed as $I_j$ in Figure 4-1. The internal activation is then
modified by a threshold function, F(I), which determines the strength of the output 
connection. The modified signal will be transmitted to other nodes in the next connected 
layer which in turn may produce the input to one or more processing elements in 
subsequent layers. Because the output of the middle nodes is not directly observable, the 
middle layers can be thought of as hidden. Each processing element may have any 
number of incoming or outgoing connections but the output signals, $y_j$, from node j must
all be the same. 
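
The computation performed by a single processing element can be sketched in a few lines. This is a minimal illustration, not the internals of any particular package: the function names are hypothetical, and the logistic (sigmoid) function is assumed as the threshold function F(I), which the text above does not specify.

```python
import math

def sigmoid(x):
    """Logistic threshold function (an assumed choice of F)."""
    return 1.0 / (1.0 + math.exp(-x))

def processing_element(inputs, weights, threshold_fn=sigmoid):
    """Return the output signal of one processing element.

    The internal activation is the sum of each incoming value multiplied by
    its connection weight; the threshold function then maps that sum to the
    single output value sent along every outgoing connection.
    """
    internal_activation = sum(x * w for x, w in zip(inputs, weights))
    return threshold_fn(internal_activation)

# Three incoming connections with illustrative values and weights.
y = processing_element([1.0, 0.5, -1.0], [0.2, -0.4, 0.1])
print(y)
```

Passing a different `threshold_fn` (for example a hard step function) changes how the internal activation is mapped to the output without altering the summation step.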

Neural networks build models based on historical data. The connection weights 
and threshold values developed by the model are then applied to a new data set. This 
process is analogous to fitting a regression model based on past data and then utilizing the
model for prediction. Both techniques require the identification and categorization of both
the input and the output, i.e., is it binary, continuous, cardinal, or some other form? The
major difference that exists is that the regression model requires specification of an exact 
functional model. Although the number of processing elements and layers in the neural
network determines the complexity of the relationships that the network can capture, this
is not as stringent a task as the development of a specific functional form. (Wiggins, p.
28)

Regression analysis and neural network techniques also require the estimation or 
training of the model. In both cases, it is common to validate the resulting model against 
data not used during estimation or training. However, in the case of regression analysis, 
it is usually possible to evaluate the statistical significance of the estimated parameters, 
assuming the errors follow some specified distribution. Thus, the primary differences 
between regression and neural networks are the inherent flexibility of a neural network, 
and the inability, in general, to test the statistical significance of a neural network model.
(Wiggins, p. 28) 

Like regression, the most popular fitting criterion for neural networks is 
minimization of the squared errors, but individual values rather than their sums are 
examined. Neural networks are applicable in any situation where there is an unknown 
relationship between a set of input factors and an outcome, and for which a representative 
set of historical examples of this unknown mapping is available. The objective of 
building a model is to find a formula or program that facilitates predicting the outcome 
from the input factors. (Predict, p. 1-2) 
Figure 4-1. Neural Network Processing Element.











3. Backpropagation 


The primary application of backpropagation is the solving of complex, non-linearly
separable problems. The backpropagation algorithm is the most common method
of adjusting the weights of a multilayer artificial neural network. One third of current
research and almost three quarters of current applications utilize this algorithm (Wiggins,
p. 17).

The goal of backpropagation is to minimize the squared error of the predictions
over all of the observations. In backpropagation, the output error is assumed to be
collectively contributed by all connection weights. Weights normally commence the
training process as small random values.

Figure 4-2 shows a three layer feedforward network. The input weights are
designated as $a_h^k$. The interlayer weights are designated as $V_{hi}$ and $W_{ij}$. The output
weights are illustrated as $c_j^k$. The processing elements are shown as $a_h$, $b_i$ or $c_j$ if the
processing element is topographically located in the input, hidden or output layer,
respectively. The summation of the processing elements per layer is represented by
$F_A$, $F_B$ and $F_C$, corresponding to the input, hidden and output layers. The threshold
function value for each $F_B$ processing element connection is designated as $\Theta_i$, and the
threshold function value for each $F_C$ processing element connection is designated as
$\Gamma_j$. The spatial patterns, which are each a single possible path through the network, are
represented as vector pattern pairs $(A_k, C_k)$, $k = 1, 2, \ldots, m$.

Every iteration of network training utilizing the backpropagation algorithm
consists of two sweeps through the network. The first sweep starts with the input to the
network’s input layer. The processing elements of the input layer transmit all of the
components to the hidden layer. The outputs of the hidden layer are then transmitted to
the output layer. After the estimate is emitted from the network, each output layer
processing element is supplied with its component of the correct output, and the error
between the actual value and the estimate is computed. Then the backward sweep begins.
The output layer processing elements adjust their threshold values to more closely match
the actual output. Next, the hidden layer processing elements adjust their weights based
on the new output weights. Finally, the input processing elements adjust their input
weights based on the input from the other two layers. This iterative process concludes
when the output error converges to within an acceptable tolerance defined by the user. A
disadvantage of backpropagation is the sometimes lengthy convergence time.
The possibility also exists that a network will never converge.

The most concise explanation of the backpropagation algorithm found during the
course of research is contained in Simpson (1989, p. 114-115). It describes the objective
function as a cost function that is minimized by making weight connection adjustments
according to the error between the computed and desired output ($F_C$) processing element
values. The cost function that is minimized is the squared error, which is the squared
difference between the computed output value and the desired output value for each $F_C$
processing element summed across all paths in the data set.





Figure 4-2. A Three Layer Feedforward Network. 











The weight adjustment procedure is derived by computing the change in the cost
function with respect to the change in each weight. What makes this paradigm so
powerful is that this derivation is extended to find the equation for adapting the
connections between the input ($F_A$) and hidden ($F_B$) layers of a multilayer artificial
neural network, as well as the next to the last layer (the last hidden layer) to output
layer adjustments. The key element of the extension to the hidden layer adjustments is the
realization that each $F_B$ processing element’s error is the proportionally weighted sum
of the errors produced at the $F_C$ layer.

Simpson outlines this algorithm using a three step process for the three-layer 
topology illustrated in Figure 4-2: 

1. Assign random values in the range [-1, +1] to all the $F_A$-to-$F_B$ inter-layer
connections, $v_{hi}$, all the $F_B$-to-$F_C$ inter-layer connections, $w_{ij}$, to each
$F_B$ processing element threshold function value, $\Theta_i$, and to each $F_C$
processing element threshold function value, $\Gamma_j$.

2. For each pattern pair $(A_k, C_k)$, $k = 1, 2, \ldots, m$, do the following:

a. Transfer vector $A_k$’s values to the $F_A$ processing elements, filter the $F_A$
processing element activations through $V$ and calculate the new $F_B$ processing element
values (activations) using the equation

$b_i = f\left(\sum_{h=1}^{n} a_h v_{hi} + \Theta_i\right)$ (4-1)

for all $i = 1, 2, \ldots, p$, where $b_i$ is the activation value of the ith $F_B$
processing element, $\Theta_i$ is the ith $F_B$ processing element’s threshold value, and
$f()$ is the logistic sigmoid threshold function $f(x) = (1 + e^{-x})^{-1}$.

b. Filter the $F_B$ activations through $W$ to $F_C$ using the equation

$c_j = f\left(\sum_{i=1}^{p} b_i w_{ij} + \Gamma_j\right)$ (4-2)

for all $j = 1, 2, \ldots, q$, where $c_j$ is the activation value of the jth $F_C$
processing element and $\Gamma_j$ is the jth $F_C$ processing element’s threshold value.


c. Compute the discrepancy (error) between the computed and desired $F_C$
processing element values using the equation

$d_j = c_j (1 - c_j)(c_j^k - c_j)$ (4-3)

for all $j = 1, 2, \ldots, q$, where $d_j$ is the jth $F_C$ processing element’s computed
error and $c_j^k$ is the desired value of that element taken from pattern $C_k$.

d. Calculate the error of each $F_B$ processing element relative to each $d_j$
with the equation

$e_i = b_i (1 - b_i) \sum_{j=1}^{q} w_{ij} d_j$ (4-4)

for all $i = 1, 2, \ldots, p$, where $e_i$ is the ith $F_B$ processing element’s computed
error.
e. Adjust the $F_B$ to $F_C$ connections

$\Delta w_{ij} = \alpha b_i d_j$ (4-5)

for all $i = 1, 2, \ldots, p$, and all $j = 1, 2, \ldots, q$, where $\Delta w_{ij}$ is the
amount of change made to the connection from the ith $F_B$ to the jth $F_C$ processing
element, and $\alpha$ is a positive constant representing the learning rate.

f. Adjust the $F_C$ thresholds

$\Delta \Gamma_j = \alpha d_j$ (4-6)

for all $j = 1, 2, \ldots, q$, where $\Delta \Gamma_j$ is the amount of change to the jth
$F_C$ processing element’s threshold value.

g. Adjust the $F_A$ to $F_B$ connections

$\Delta v_{hi} = \beta a_h e_i$ (4-7)

for all $h = 1, 2, \ldots, n$, and all $i = 1, 2, \ldots, p$, where $\Delta v_{hi}$ is the
amount of change made to the connection from the hth $F_A$ to the ith $F_B$ processing
element, and $\beta$ is a positive constant controlling the learning rate.

h. Adjust the $F_B$ thresholds

$\Delta \Theta_i = \beta e_i$ (4-8)

for all $i = 1, 2, \ldots, p$, where $\Delta \Theta_i$ is the amount of change to the ith
$F_B$ processing element’s threshold value.

3. Repeat step (2) until the error correction value, $d_j$, for each
$j = 1, 2, \ldots, q$, and each $k = 1, 2, \ldots, m$, is either sufficiently low or zero.
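
The three steps above can be sketched in Python for readers who prefer code to index
notation. This is only an illustrative sketch under stated assumptions, not code from
Simpson or from this study: the function names (`sigmoid`, `forward`, `train_backprop`),
the default learning rates, the stopping tolerance `tol`, the `max_epochs` safety cap and
the fixed random seed are all hypothetical choices made for the example.

```python
import math
import random


def sigmoid(x):
    """Logistic sigmoid threshold function f(x) = 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(a, v, w, theta, gamma):
    """Forward sweep: hidden activations b (4-1) and output activations c (4-2)."""
    n, p, q = len(v), len(theta), len(gamma)
    b = [sigmoid(sum(a[h] * v[h][i] for h in range(n)) + theta[i])
         for i in range(p)]
    c = [sigmoid(sum(b[i] * w[i][j] for i in range(p)) + gamma[j])
         for j in range(q)]
    return b, c


def train_backprop(patterns, n, p, q, alpha=0.5, beta=0.5,
                   tol=0.01, max_epochs=20000, seed=0):
    """Train a three-layer network; patterns is a list of (A_k, C_k) pairs."""
    rng = random.Random(seed)
    # Step 1: random weights and thresholds in [-1, +1].
    v = [[rng.uniform(-1, 1) for _ in range(p)] for _ in range(n)]
    w = [[rng.uniform(-1, 1) for _ in range(q)] for _ in range(p)]
    theta = [rng.uniform(-1, 1) for _ in range(p)]
    gamma = [rng.uniform(-1, 1) for _ in range(q)]

    for epoch in range(1, max_epochs + 1):
        worst = 0.0
        # Step 2: one forward and one backward sweep per pattern pair.
        for a, target in patterns:
            b, c = forward(a, v, w, theta, gamma)
            # (4-3) output errors and (4-4) hidden errors.
            d = [c[j] * (1 - c[j]) * (target[j] - c[j]) for j in range(q)]
            e = [b[i] * (1 - b[i]) * sum(w[i][j] * d[j] for j in range(q))
                 for i in range(p)]
            # (4-5), (4-6): hidden-to-output weights and output thresholds.
            for j in range(q):
                for i in range(p):
                    w[i][j] += alpha * b[i] * d[j]
                gamma[j] += alpha * d[j]
            # (4-7), (4-8): input-to-hidden weights and hidden thresholds.
            for i in range(p):
                for h in range(n):
                    v[h][i] += beta * a[h] * e[i]
                theta[i] += beta * e[i]
            worst = max(worst, max(abs(x) for x in d))
        # Step 3: stop once every error correction value d_j is small.
        if worst < tol:
            break
    return v, w, theta, gamma, epoch
```

For example, calling `train_backprop` on the four pattern pairs of the logical OR function
with `n=2, p=2, q=1` returns weights that can be passed to `forward` to reproduce the
targets after rounding. The `max_epochs` cap reflects the caution above that a network
may never converge.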


Backpropagation is not guaranteed to find the global minimum error during
training, only a local minimum error. This is an area that is being further explored in
research. Its strengths include an ability to store many more patterns than the number of
$F_A$ dimensions (m > n) and an ability to acquire complex nonlinear mappings. However,
its major limitation is its extremely long training time.


B. A NEURAL NETWORK APPLICATION TO A WALL STREET PROBLEM 


1. Gold Futures Model 


Grudnitski and Osburn (1993) examined the feasibility of utilizing neural 
networks to forecast monthly price changes of Standard & Poor’s (S&P) 500 Stock index 
futures market and the Commodity Exchange (COMEX) Incorporated’s gold futures 
market (gold) based on past price changes. The period December 1982 to September 
1990 was studied. The research contributed to the suggestion that the standard random 
walk assumption of futures prices may actually be only a veil of randomness that shrouds 
a noisy nonlinear process. Because of the proprietary nature of such studies and the 
cutthroat nature of futures markets, Grudnitski and Osburn’s study is one of the few 
published in this area. 

Grudnitski and Osburn presumed that two factors other than price trends are
related to price movements of futures, namely, general economic conditions and traders’
expectations. They created two networks, an associative network and a forecasting
network. The associative network decides if conducting a trade is advised. This is based
on the similarity of the presented pattern to the components of the training set. The
associative network grades the pattern between 0 and 1, with 1 meaning that the pattern
is identical to one in the training set, and 0 meaning that there is no similarity. If the
grade is greater than 0.5, then the decision to trade is made. If the pattern did not appear
before, then a trade is not considered. The forecasting network then predicts the price
change based on the price changes that followed the training patterns that the presented
pattern resembles.


2. Input and Output


Three distinct input variables were used for both the forecast and associative 
networks: 

1. Monthly growth rate of the aggregate supply of money, M-1, that was 
compiled from Barron’s. This was intended to represent an underlying economic factor 
that influences both the S&P and gold markets futures contract prices. 

2. The change in price and price volatility of S&P and gold futures prices. 
Volatility is defined as the market’s price range and movement within that range. The 
direction of the price move, whether up or down, is not relevant (NYMEX, p. 32). 

3. End of month net percentage commitments of large speculators, large hedgers, 
and small traders. Net percentage commitments are the net, long minus short, positions 
of trading groups divided by the total open interest in the future. The positions of the 
three types of trading groups are compiled monthly by the Commodity Futures Trading 
Commission. Open interest or commitment is defined as the number of open or 
outstanding contracts for which an individual or entity is obligated to the Exchange 
because that individual or entity has not yet made an offsetting sale or purchase, an actual 
contract delivery, or in the case of options, exercised the option (NYMEX, p. 23). 

For the forecasting network, there is only one output, the change of the mean for
the forecasted month. The associative network assessed the quality of the forecast as an
output matrix. The output matrix consisted of all zeroes and a one, where the one’s
location corresponded to an individual input pattern. This was done because neural
networks can only process information, make data transformations, and detect patterns.
They cannot fabricate an answer for which there has been no learning. Where no information
exists, a neural network cannot manufacture meaning. Using the derived weights from
training, the associative network will produce a value between 0 and 1 for each of the
output nodes, where 0 represents complete dissimilarity and 1 represents perfect
similarity.


3. Network Description


The input nodes represent six input parameters per month: price change
(measured in dollars), volatility (also measured in dollars), three trader sentiment
percentages, and M-1. The input parameters are grouped four months at a time, thereby
providing 24 input nodes. The forecast network architecture consists of 24 input nodes,
two hidden layers (24 nodes in the first hidden layer and 8 nodes in the second), and an
output layer of one node. The similarity network consists of 24 input nodes, one hidden
layer of 24 nodes and 15 output nodes.
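
The grouping of six monthly parameters over four months into 24 input values can be
sketched as a sliding window. This is an illustrative sketch only; the function name
`build_input_patterns`, the ordering of the six values within a month, and the choice of
a window that advances one month at a time are assumptions, since the source does not
give those details.

```python
def build_input_patterns(monthly_rows, window=4):
    """Flatten each run of `window` consecutive months into one input vector.

    monthly_rows: one sequence of six values per month, assumed here to be
    (price change, volatility, three trader percentages, M-1 growth).
    """
    patterns = []
    for start in range(len(monthly_rows) - window + 1):
        vector = []
        for row in monthly_rows[start:start + window]:
            vector.extend(row)  # append this month's six parameters
        patterns.append(vector)
    return patterns
```

With six parameters per month and a four month window, every vector returned has
6 x 4 = 24 entries, one per input node.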


4. Network Training 


Grudnitski and Osburn felt that the most important decision to be made was to
establish the duration of the training period. The tradeoff involved is between providing
enough training patterns for adequate learning to take place and a desire to test the
ability of the network to generalize during the bullish, bearish, and trendless market
periods that characterize a business cycle. To find a feasible training set size, the 90
periods of data were divided into training sets of 30, 45, and 60 months duration. Then
the training sets were evaluated for similarity, an output of the similarity network.
Test patterns exceeding similarity values of 0.5 to any training pattern are assessed as
being similar. The largest partitioning of unique patterns occurred in groupings of 15
months because that was the largest pattern noted without duplication. The networks were
adaptively trained from patterns representing the most recent 15 months of data. On each
iteration through the complete training set, the parameters of the networks were modified
to minimize the average sum of squared errors between the target values and the
calculated values of the training set.


5. Results 


The effectiveness of the neural networks was assessed in a two step process. The
first question asked was “Should I trade?” The answer to that question was “yes” if the
similarity matrix yielded a similarity rating that exceeded 0.5 and “no” otherwise.
Grudnitski and Osburn’s study yielded 45 “yes” answers for the S&P data and 51 “yes”
answers for the gold data out of a total of 75 possible decisions. The second question
asked was “Does the sign of the actual price changes of all similar training patterns
agree?” If “yes”, then a trade was simulated. There were 41 “yes” occurrences for both
the S&P and gold data.
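
The two step decision described above can be written as a small filter. This is a
hypothetical sketch of the logic as summarized here, not Grudnitski and Osburn’s code;
the function name `should_trade` and the representation of price change direction as +1
or -1 are assumptions.

```python
def should_trade(similarities, training_signs, threshold=0.5):
    """Return True only if a similar training pattern exists and all agree in sign.

    similarities: similarity rating (0 to 1) of the test pattern to each
    training pattern. training_signs: +1 or -1, the direction of the actual
    price change that followed each training pattern.
    """
    # Step 1: "Should I trade?" needs at least one rating above the threshold.
    similar = [sign for rating, sign in zip(similarities, training_signs)
               if rating > threshold]
    if not similar:
        return False
    # Step 2: the price change signs of all similar patterns must agree.
    return all(sign == similar[0] for sign in similar)
```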

The trade simulations predicted whether the change in the next month’s mean price
would be positive or negative. The S&P trades matched the actual direction, and thus were
“correct”, 75% of the time, and the gold trades were correct 61% of the time. Grudnitski
and Osburn attributed the difference in accuracy to gold’s price changes resembling a
sawtooth curve more than the S&P data.

The 41 trades of S&P and gold futures contracts resulted in average per period
(and cumulative) returns on investment of 17.04% (698%) and 16.36% (670%), respectively.
The comparable simple moving average forecast for gold resulted in an average per period
and cumulative return on investment of 2.88% and 118.13%. However, it was noted that
these results were aided by trading selectively, only if a similar pattern could be
recognized by the neural network. If no similar pattern was recognized, no trade was
performed. Thus with proper filtering of the data, profitable results can be realized
more often than not.


C. DOD APPLICATIONS 


1. DNSC MODEL 


The Defense National Stockpile Center (DNSC) is another Inventory Control
Point of the Defense Logistics Agency. DNSC manages commodities to ensure that the
United States will have critical raw materials to support both military requirements and
the U.S. economy during a war. This reserve of materials diminishes the United States’
dependence on foreign nations.

DNSC recently performed a study examining whether neural networks could be used to
predict commodity prices more accurately than standard statistical techniques. The
statistical techniques used for comparison purposes were linear regression, multiple
regression, and Brown’s exponential smoothing. The measures of effectiveness were the
Mean Squared Error (MSE), Mean Absolute Error (MAE), and the coefficient of
determination ($R^2$). The metals selected for the study were aluminum, cobalt, and
nickel.


a. Data, Inputs, and Outputs
The data consisted of 209 observations (eighteen years) of monthly data.
The sources were DNSC, the Bureau of Mines, and Economic Bulletin Boards sponsored
by the Department of Commerce. The input layer consisted of seven input variables to
the network: the price of gold, the price of gold lagged one month, an inventory to sales
ratio of the metal (which served as an economic indicator illustrating the use and
production of everyday items), the price of the metal, p, the price of the metal lagged
one month, the price of the metal lagged two months, and the ratio of the price of the
metal to the price of the metal lagged one month. There was one hidden layer, and the
output layer consisted of six nodes representing the six monthly forecasted values of the
commodity.
A variety of neural network architectures are available, and several types
were evaluated. The backpropagation and general regression neural networks were
determined to be most suitable. Ten percent of the eleven years of data was randomly
selected by the NeuroShell software to designate the test set for these two neural
networks. The remaining data was used for training the networks.


b. Results
Because the testing data set was approximately 10% of the data set and
was deemed too small to use with the statistical methods, the training set, which consisted
of approximately 90% of the data, was used to formulate the measures of effectiveness
for the statistical methods. When using the coefficient of determination ($R^2$) as a
measure of effectiveness, only values greater than 0.80 were deemed an acceptable fit to
the data. Only the multiple regression model for aluminum produced an acceptable value
(0.87). However, all the neural network models possessed acceptable $R^2$ values
(aluminum 0.99, cobalt 0.98, and nickel 0.99). In the evaluation of MSE and MAE
computations, the neural network models had the least error in every case.
The results indicated that the neural network’s predictions were between
8% and 100% better than the methods of Brown’s exponential smoothing, simple
regression, and multiple regression. DNSC is actively trying to establish an operations
research analyst position to maintain these neural networks and create other neural
networks for other commodities managed.


V. THE MODEL


This chapter provides a brief introduction to NeuralWorks Predict and describes
the model that was built for this study.


A. DATA SOURCE 


The data set was provided by Department of Energy Petroleum Demand Analyst 
Michael Morris. Morris compiled the historical data from the Integrated Modeling Data 
System (IMDS) electronic database. He generated results using the STIFS model 
discussed in Chapter III to compute STIFS jet fuel price predictions. Morris provided the 
identical data used for the STIFS projections in order that this researcher could present 
the same input values to the artificial neural network model to facilitate comparison. The 
period March 1982 to March 1994 was studied. The data provided included the United 
States average monthly jet fuel inventory, the United States monthly wholesale price of 
number two heating oil, the average monthly United States refiner’s acquisition cost for 
crude oil, the United States monthly retail price of jet fuel, the average monthly United 
States motor gasoline fuel price and the United States producer price index less energy
and fuel.


B. AN INTRODUCTION TO NEURALWORKS PREDICT 


This researcher was selected by NeuralWare to serve as a beta tester for a recently
released product named NeuralWorks Predict. Predict is a software application that
integrates all the components needed to apply neural computing to a wide variety of
problems. It is different from other neural network software applications in that it
automates much of the painstaking manipulation, selection, and data pruning that
monopolizes most of the time in building a real world neural network application. These
tasks include data analysis and transformation, variable selection, network architecture
selection, and training and test set selection.

The primary user interface to Predict is via Microsoft Excel. This provides a
familiar front end both for supplying data to and receiving results from the Predict
application. The Excel interface facilitates access to all parameters that control the
various algorithms, and allows examination of the results of the trained model. An added
benefit to utilizing the Excel environment is the flexibility and charting capabilities for
further model analysis and the ability to build third party macros (Predict, p. 1-1).


1. Building a Neural Network Model in Predict 


Several steps take place when building a model in Predict. The first defines the
system objective. Predict is capable of providing solutions to prediction, ranking and
classification problems. This study utilized a prediction problem type. Next the user
selects a learning rule for the data set. Predict supports two learning rules: adaptive
gradient and Kalman filter. The adaptive gradient learning rule uses backpropagated
gradient information to guide an iterative line search algorithm. Brent’s algorithm is
used to search along the search direction for a minimum of the objective function. The
adaptive gradient process is repeated until a local minimum of the objective function has
been found (Predict, p. 10-2). The Kalman filter learning rule considers the weights to be
states and the desired outputs to be the observations within a discrete state space
transition framework. If very noisy data is used, then the program selects the Kalman
learning rule (Predict, p. 5-11). This rule is especially effective for noisy behavioral
problems because it possesses a built-in ability to suppress noise (Predict, p. 1-16). This
study used moderately noisy data and thus the adaptive gradient approach.

The software allows the user to choose a type of data analysis and transformation
level, or it will choose the default setting. The data analysis examines each data field and 
determines the type of field and the types of transformation that will convert the field for 
effective use by the neural network. A higher analysis type will work harder to find good 
transformations, and may create more transformations per field (Predict, p. 4-15). The 
user has the choice of selecting: scale data only, superficial data transformation, moderate 
data transformation, or comprehensive data transformation. The comprehensive data 
transformation setting was used. 

Picking the right input variables is critical to effective model development. A
good subset of variables can substantially improve the performance of a model. The
variable selection component of Predict determines which set of fields and
transformations work well together for predicting the output (Predict, p. 4-15). Predict
utilizes a genetic algorithm to search for good sets of input variables as created by the
Data Analysis and Transformation component. For each possible set, a network is
developed, and the performance of the network is used to rank the subset of inputs. The
levels available to the user are: no variable selection, superficial variable selection,
moderate variable selection, comprehensive variable selection, or exhaustive variable
selection. This study used the comprehensive variable selection setting (Predict, p. 4-18).

The next selection made is the neural network search level. This feature allows 
the user to specify how hard Predict works in building the model. The higher the level, 
the more time consuming the search. The levels available to the user are: no network 
search, superficial network search, moderate network search, comprehensive network 
search, or exhaustive network search (Predict, p. 4-21). A comprehensive network search 
was used. 

Predict allows the option of training several networks rather than just one. This is
important because the first network trained is not necessarily the best model. To identify
the best model, different combinations of the number of input processing elements or the
number of hidden processing elements are examined. By trial and error, it was
determined that at least five networks but no more than ten networks would be trained.
Predict includes two features referred to as patience and tolerance. The tolerance is the
smallest improvement in the fitness of the model that is considered meaningful. The
patience level is the number of iterations over which Predict waits for an improvement in
fitness that exceeds the tolerance. The maximum number of iterations may not be reached
if the test performance of successive networks does not improve by more than the
tolerance value within the patience level specified. The best performing network is
retained at the end (Predict, p. 5-14).
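
The interaction of patience and tolerance can be illustrated with a generic stopping
rule. This is a sketch of the general idea only and should not be read as Predict’s
internal algorithm, which the source does not describe in detail; the function name
`select_best_network`, the assumption that a higher fitness score is better, and the rule
of comparing each network with the best one so far are all hypothetical.

```python
def select_best_network(scores, patience=3, tolerance=0.01):
    """Return (best_index, best_score, networks_trained).

    scores: test-set fitness of each candidate network, in training order.
    Training stops once `patience` successive networks fail to improve on
    the best score so far by more than `tolerance`.
    """
    best_index = 0
    stale = 0      # successive networks without a meaningful improvement
    trained = 1
    for index in range(1, len(scores)):
        trained += 1
        improvement = scores[index] - scores[best_index]
        if improvement > 0:
            best_index = index  # always retain the best performing network
        if improvement > tolerance:
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break           # stop before the maximum number is reached
    return best_index, scores[best_index], trained
```

For instance, with scores of 0.80, 0.85, 0.851, 0.852, 0.853 and 0.95, a patience of 3
and a tolerance of 0.01, the search stops after the fifth network, so the sixth is never
trained even though it would have scored higher. This is the cost of a low patience
setting.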

Another important step in model development within Predict is the selection of
training, testing, and validation sets. The purpose of developing a neural network model
is to produce a formula that captures essential relationships in data. Once developed, this
formula is used to interpolate from a new set of inputs to corresponding outputs. In
neural networks, this is called generalization. The training set is the set of data points
that are used to fit the parameters of the model. The test set measures how well the
model interpolates. It is used as part of the model building process to prevent
over-fitting. The validation set is used to estimate model performance in a deployed
environment (Predict, p. 1-8).

Predict builds and trains the model once the user completes the dialog for building
and training. At the conclusion of training, the user “runs the network” and the
predictions are written to an area of the spreadsheet designated by the user.


C. THE PREDICT MODEL FOR GENERATING JET FUEL PRICES 


1. Inputs Presented to the Network 


This study presented 145 observations to the software. A 486/66 MHz personal
computer with the Microsoft MS-DOS 6.2 operating system, Microsoft Windows 3.1,
Microsoft Office 4.2, Microsoft Excel 5.0a and the NeuralWare NeuralWorks Predict A04
beta release built and trained the artificial neural network, using the settings described
above, in approximately 3 hours and 42 minutes. Predict can build and train the model in
the background, thus allowing multi-tasking. The 145 observations were partitioned into
a training set that comprised 70% of the data. The remaining 30% became the test set. The
validation set utilized all of the data. The eight inputs to the model were:

a. A ratio of the jet fuel supply lagged one month and the current jet fuel demand,
which produced an inventory value for the United States;

b. The same ratio lagged one month;

c. Number two heating oil wholesale price for the current month;

d. Number two heating oil wholesale price lagged one month;

e. United States refiner’s acquisition cost for crude oil;

f. United States refiner’s acquisition cost for crude oil lagged one month;

g. Price of kerosene based jet fuel lagged one month;

h. Price of kerosene based jet fuel lagged two months.

The variable selection feature of Predict chose six input transformations that 
originated from three input fields of each octuplet observation. The three fields selected 
were: the number two heating oil wholesale price for the current month, the price of 
kerosene based jet fuel lagged one month and the price of kerosene based jet fuel lagged 
two months. 
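
The lagged inputs in the list of eight model inputs can be assembled from four monthly
series. The sketch below is illustrative only: the function names `lagged` and
`build_observations` are hypothetical, the inventory ratio is assumed to be supplied as
an already computed series, and months that lack a full set of lags are simply dropped.

```python
def lagged(series, lag):
    """Shift a monthly series back by `lag` months; early months become None."""
    return [None] * lag + list(series[:len(series) - lag])


def build_observations(ratio, heating_oil, crude_cost, jet_price):
    """Build one eight-value row per month, in the order of inputs a to h."""
    columns = [
        ratio, lagged(ratio, 1),              # a, b
        heating_oil, lagged(heating_oil, 1),  # c, d
        crude_cost, lagged(crude_cost, 1),    # e, f
        lagged(jet_price, 1),                 # g
        lagged(jet_price, 2),                 # h
    ]
    rows = [list(row) for row in zip(*columns)]
    # Drop the first months, which have no complete lag history.
    return [row for row in rows if None not in row]
```

Because the longest lag is two months, the first two months of any series produce no
observation.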

These six input transformation nodes form the input layer. The input layer is fully
connected to the hidden layer, which has sixteen nodes. The hidden layer is then
connected to the output layer. The output layer has only one element, the output value,
which is the predicted price of jet fuel.


VI. A COMPARISON OF THE DEPARTMENT OF ENERGY’S STIFS MODEL
AND AN ARTIFICIAL NEURAL NETWORK MODEL


This chapter examines the output of the Department of Energy’s STIFS model
and compares the results to the predictions made by the NeuralWorks Predict model. The
sources for the definitions of the measures of effectiveness are Principles of
Inventory and Materials Management (Tersine, pp. 40-43) and Econometric Analysis
(Greene, p. 192).


A. MEASURES OF EFFECTIVENESS USED IN THIS STUDY 


The presence of randomness precludes a perfect forecast (Tersine, p. 40).
Therefore, statistical computations that measure the size of the error may be beneficial in
evaluating the forecasting techniques used in this study. The measures of effectiveness
(MOEs) used in this study are the coefficient of determination ($R^2$), mean squared
error, mean absolute percent error, mean absolute deviation, minimum absolute error, and
maximum absolute error. (In the equations, $y_i$ indicates the actual value and
$\hat{y}_i$ indicates the estimated value.)


1. Coefficient of Determination ($R^2$)


The coefficient of determination is a standard measure of effectiveness for a
regression model that measures how well the model fits the data. It is a number between
0 and 1 that measures the squared correlation between the observed values of y and the
predictions produced by the model. The value of $R^2$ measures the proportion of the total
variation of the dependent variable which is explained by the independent variables. The
higher the number, the better the fit. The equation may be expressed as:

$R^2 = \frac{\left[\sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})\right]^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2 \, \sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2}$ (6-1)

where $y_i$ = the actual price of jet fuel
$\hat{y}_i$ = the forecast price of jet fuel
$\bar{y}$ = the average actual price of jet fuel
$\bar{\hat{y}}$ = the average forecast price of jet fuel


2. Mean Squared Error (MSE) 


A commonly used measure for summarizing historical errors is the mean squared
error. The MSE is the average of the squared errors that measures the deviation of the
forecasts from the actuals. The squaring process does not differentiate whether the error
is positive or negative. It may be expressed as:

$MSE = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}$ (6-2)

where n = the number of data points in the subset. MSE penalizes a forecasting technique
much more heavily for larger errors than for smaller ones (Tersine, p. 42-43).


3. Mean Absolute Percent Error (MAPE) 
The mean absolute percent error is similar to a percentage form of MSE, but it
does not square the deviations. It also does not differentiate whether errors are positive or
negative. It is expressed as:

$MAPE = \frac{100 \sum_{i=1}^{n} \frac{|y_i - \hat{y}_i|}{y_i}}{n}$ (6-3)


4. Mean Absolute Deviation (MAD) 

Mean absolute deviation is another commonly used measure for summarizing
historical error. The MAD is the average absolute error that measures the deviation of the
forecast, and does not differentiate whether the error is positive or negative. MAD is
more forgiving than MSE for larger errors. The smaller the MAD, the better the
forecast. MAD may be expressed as:

$MAD = \frac{\sum_{i=1}^{n} |y_i - \hat{y}_i|}{n}$ (6-4)


5. Minimum and Maximum Absolute Error 


The minimum and maximum absolute errors indicate the smallest and largest
magnitudes of error found. Minimum absolute error indicates how close the nearest data
point is to the fitted line. Maximum absolute error indicates how much error the most
extreme outlier contributes. They may be expressed as:

Minimum Absolute Error = $\min_i \{ |y_i - \hat{y}_i| \}$ (6-5)
Maximum Absolute Error = $\max_i \{ |y_i - \hat{y}_i| \}$ (6-6)
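
The six measures of effectiveness defined in this section can be computed with a few
lines of Python. This is an illustrative sketch rather than the code used in this study,
and the function names are hypothetical. The coefficient of determination is computed
here as the squared correlation between actual and forecast values, following the verbal
definition given above.

```python
def coefficient_of_determination(actual, forecast):
    """Squared correlation between actual and forecast values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_f = sum(forecast) / n
    cross = sum((a - mean_a) * (f - mean_f) for a, f in zip(actual, forecast))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    return cross ** 2 / (var_a * var_f)


def mean_squared_error(actual, forecast):
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def mean_absolute_percent_error(actual, forecast):
    # Assumes every actual value is nonzero, as prices are.
    return 100.0 * sum(abs(a - f) / a
                       for a, f in zip(actual, forecast)) / len(actual)


def mean_absolute_deviation(actual, forecast):
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def min_absolute_error(actual, forecast):
    return min(abs(a - f) for a, f in zip(actual, forecast))


def max_absolute_error(actual, forecast):
    return max(abs(a - f) for a, f in zip(actual, forecast))
```

As a small worked example with made-up numbers, actual values of 10, 20, 30 and 40 and
forecasts of 12, 18, 33 and 39 give absolute errors of 2, 2, 3 and 1, so the MSE is
(4 + 4 + 9 + 1) / 4 = 4.5 and the MAD is 8 / 4 = 2.0.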


B. COMPARATIVE ANALYSIS 


When building an artificial neural network, a training set and test set are used.
The data set under consideration is partitioned into three subsets: a training set of 101
data points, a test set of 44 points, and the entire working set of 145 points. Predict
randomly picked 70% of the data for the training set. The remaining data was included in
the test set; the only criterion for membership in the test set was not being in the
training set. Once the data was sorted, the results of STIFS and of the artificial neural
network (ANN) modeled in Predict could be analyzed on the same data. The artificial
neural network model built in Predict will be referred to as Predict ANN.
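
A random 70/30 partition of this kind can be sketched as follows. This is not Predict’s
selection routine, which the source does not describe beyond being random; the function
name `split_train_test`, the use of Python’s `random` module and the fixed seed are
assumptions made for illustration.

```python
import random


def split_train_test(n_obs, train_fraction=0.7, seed=0):
    """Randomly assign observation indices to a training set and a test set."""
    rng = random.Random(seed)
    n_train = int(train_fraction * n_obs)  # 145 observations give 101
    train = sorted(rng.sample(range(n_obs), n_train))
    in_train = set(train)
    # The test set is the complement of the training set.
    test = [i for i in range(n_obs) if i not in in_train]
    return train, test
```

With 145 observations, this yields 101 training indices and 44 test indices, matching
the subset sizes reported above.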

Table 6-1 summarizes the results of the training sets. Predict ANN outperformed
the STIFS model overwhelmingly in four of six measures of effectiveness. The
coefficient of determination for STIFS was a mere 0.000372 higher than that of Predict
ANN. An interesting occurrence was the significant decrease in most measures of error
with Predict ANN.

Measure of Effectiveness                 DOE STIFS MODEL    NEURALWORKS PREDICT ANN

Coefficient of Determination ($R^2$)            0.987734                   0.987362
Mean Squared Error                              6.544775                   3.723157
Mean Absolute Percent Error                     2.564670                   1.909111
Mean Absolute Deviation                         1.741406                   1.294944
Minimum Absolute Error                          0.008300                   0.021603
Maximum Absolute Error                         15.677000                   7.445343

TABLE 6-1. Training Set Measures of Effectiveness (size n = 101).


The mean squared error was 2.821618 less in Predict ANN. The mean absolute
percent error was 0.655559 less, and the mean absolute deviation was 0.446462 less, in
Predict ANN. The minimum absolute error was 0.013303 less in STIFS, but the minimum
absolute error value produced by STIFS is irrelevant because of the nature of fitting a
line with regression. However, Predict ANN does minimize the error of the outliers.
Predict ANN calculated the MSE 43.1% lower than STIFS, MAPE 25.6% lower and MAD
25.6% lower.

Table 6-2 summarizes the test set data. The test set is designated as the 
complement of the training set in this study. This produced a subset of 44 points. The 
comparison of the two models revealed Predict ANN outperforming STIFS in five out of 
six categories. The lone category that STIFS outperformed Predict ANN was the 
maximum absolute error computation. The difference of 9.4% is quite small. Predict 
ANN again overwhelmingly minimized the error found in the model better than STIFS 
with MSE 20.5% lower, MAPE 30.8% and MAD 23.2% lower than the forecasts 
produced by STIFS. 

Measure of Effectiveness              DOE STIFS MODEL    NEURALWORKS PREDICT

Coefficient of Determination (R²)            0.998981               0.999204
Mean Squared Error                           5.016221               3.987680
Mean Absolute Percent Error                  2.314592               1.602391
Mean Absolute Deviation                      1.550638               1.190390
Minimum Absolute Error                       0.077100               0.049209
Maximum Absolute Error                       7.703000               8.428842

TABLE 6-2. Test Set Measures of Effectiveness (size n=44). 

Table 6-3 summarizes the combined data, and Predict ANN again outperformed STIFS 
in five out of six measures of effectiveness. Predict ANN’s coefficient of 
determination was 0.008797 higher. The mean squared error, mean absolute percent 
error, and mean absolute deviation were 2.277512, 0.672748 and 0.420301 less, 
respectively. These differences correspond to 37.5%, 27.0% and 25.0% less error in 
MSE, MAPE and MAD, respectively. 


Measure of Effectiveness              DOE STIFS MODEL    NEURALWORKS PREDICT

Coefficient of Determination (R²)            0.977071               0.985868
Mean Squared Error                           6.080938               3.803426
Mean Absolute Percent Error                  2.488785               1.816037
Mean Absolute Deviation                      1.683518               1.263217
Minimum Absolute Error                       0.008300               0.021603
Maximum Absolute Error                      15.677000               8.428842





TABLE 6-3. Combined Data Measures of Effectiveness (size n=145). 
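
Because MSE, MAPE and MAD are averages over observations, the combined values (n=145) should equal the sample-size-weighted averages of the training-set (n=101) and test-set (n=44) values. The Python sketch below checks this for the reported figures. The helper name `pooled_mean` is hypothetical, and the Predict ANN training-set MSE is taken as 3.723157, the value implied by the 2.821618 difference from the STIFS value of 6.544775 stated in the text; that reading is an assumption.

```python
def pooled_mean(n_train, train_value, n_test, test_value):
    """Sample-size-weighted average of two per-observation means."""
    return (n_train * train_value + n_test * test_value) / (n_train + n_test)


N_TRAIN, N_TEST = 101, 44

# (model, measure): (training value, test value, reported combined value)
reported = {
    ("STIFS", "MSE"): (6.544775, 5.016221, 6.080938),
    ("STIFS", "MAPE"): (2.564670, 2.314592, 2.488785),
    ("STIFS", "MAD"): (1.741406, 1.550638, 1.683518),
    # Training MSE below is the value implied by the text (assumption).
    ("Predict", "MSE"): (3.723157, 3.987680, 3.803426),
    ("Predict", "MAPE"): (1.909111, 1.602391, 1.816037),
    ("Predict", "MAD"): (1.294944, 1.190390, 1.263217),
}

# Absolute gap between the pooled value and the reported combined value.
pooled_gaps = {}
for key, (train_value, test_value, combined_value) in reported.items():
    pooled = pooled_mean(N_TRAIN, train_value, N_TEST, test_value)
    pooled_gaps[key] = abs(pooled - combined_value)
    print(key, f"pooled={pooled:.6f}", f"reported={combined_value:.6f}")
```

Agreement to within rounding suggests that the training, test and combined figures are mutually consistent. The coefficient of determination and the minimum and maximum absolute errors are not simple averages and cannot be pooled this way.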
VII. CONCLUSIONS 


A. SUMMARY 


This thesis has provided a view into the area of modeling using artificial neural 
networks. An introduction to neural networks was provided, and two recent studies of 
forecasting commodities prices were reviewed. The jet fuel price segment of the 
Department of Energy’s Short-Term Integrated Forecasting System model was examined, 
and computations using twelve years of data were compared to the output of a neural 
network developed using NeuralWorks Predict. 


B. RESEARCH QUESTIONS 


The research questions posed in Chapter I are addressed as follows: 


1. Primary Research Question 


Can jet fuel prices be adequately predicted with a neural network model? Yes, it 
is possible to build a statistically sound artificial neural network with a commercially 
available software package such as NeuralWorks Predict and obtain more accurate results 
than with a conventional modeling approach such as regression. The Predict artificial 
neural network model reduced the contribution of outliers more effectively than the 
STIFS regression model, thus producing a more robust model. 


2. Subsidiary Research Questions 


Would an artificial neural network model provide better forecasting results than 
more common approaches such as an econometric regression model, specifically, the 
Department of Energy’s Short-Term Integrated Forecasting System (STIFS) model? Yes, 
the artificial neural network model provided convincing results, outperforming the STIFS 
regression model in six out of seven areas of measured effectiveness over a twelve-year 
period using monthly data. The NeuralWorks Predict model yielded a better coefficient 
of determination, correlation coefficient, mean squared error, mean absolute percent error, 
mean absolute deviation and maximum absolute error. 

Would an artificial neural network model provide a useful planning and decision 
aid for the Defense Fuel Supply Center (DFSC)? Yes, with the advent of user-friendly, 
commercially available software packages such as NeuralWorks Predict, DFSC would 
benefit from the further investigation of artificial neural networks in forecasting noisy 
data sets such as fuel prices. By reducing the error of the forecasts, better budgetary 
decisions may be made. Today’s software applications are designed to work in commonly 
used spreadsheet environments. 


C. AREAS OF FURTHER RESEARCH 


The researcher examined the use of neural networks to predict jet fuel prices. 
This model could be modified and expanded to project prices for different petroleum 
products as well as other commodities. 

Artificial neural networks can be employed beyond the pedestrian applications of 
commodity price prediction. The Navy’s Inventory Control Points at the Ships Parts 
Control Center and the Aviation Supply Office, as well as the Defense Logistics Agency 
(DLA), should examine recent expanded applications and apply these concepts to the 
areas of consumable and repairable parts management. A more robust model that 
incorporates actual fleet flight hours flown, fleet hours steamed, or some other measure of 
fleet activity level could potentially assist stock points in raising the supply management 
availability levels significantly without a corresponding dramatic increase in capital 
outlay costs. 

Companies such as TRW have published extensive research on applying artificial 
neural networks to the credit assignment problem using madaline (many adaptive linear 
neurons) artificial neural network models. The credit assignment problem is the decision 
of whether or not to grant lines of credit to individuals. This type of decision tool could 
be expanded to evaluate government contractor performance. This application would 
support the government’s increasing interest in expanding the JIT (just-in-time) 
philosophy, which decreases inventory control costs and increases the importance of 
high quality suppliers. A decision tool that assists in evaluating the uncertainty of 
potential stockouts caused by slippage of contractor delivery dates could be realized. 


D. RECOMMENDATIONS 


1. DLA needs to enhance its forecasting strategies by exploring the potential 
power of artificial neural networks. DFSC should expand this model to forecast other 
commodities of interest. 

2. There are facilities near DFSC that possess expertise in artificial neural 
networks. The Naval Research Laboratory currently conducts neural network research 
and would be available as an expert information source. The University of Maryland, 
which has several renowned professors in the field, and the International Neural Network 
Society, based in Washington, DC, are other local sources of information. 

